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Abstract. The data analysis problem of coherently searching for unmodeled 
gravitational-wave bursts in the data generated by a global network of 
gravitational-wave observatories has been at the center of research for almost 
two decades. As data from these detectors is starting to be analyzed, a renewed 
interest in this problem has been sparked. A Bayesian approach to the problem 
of coherently searching for gravitational wave bursts with a network of ground- 
based interferometers is here presented. We demonstrate how to systematically 
incorporate prior information on the burst signal and its source into the analysis. 
This information may range from the very minimal, such as best-guess durations, 
bandwidths, or polarization content, to complete prior knowledge of the signal 
waveforms and the distribution of sources through spacetime. We show that 
this comprehensive Bayesian formulation contains several previously proposed 
detection statistics as special limiting cases, and demonstrate that it outperforms 
them. 
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1. Introduction 

Large-scale, broad-band interferometric gravitational-wave observatories [U [21 El [4] 
are operating at their anticipated sensitivities, and scientists around the globe have 
begun to analyze the data generated by these instruments in search of gravitational 
wave signals [5]. Gravitational wave bursts (GWBs) are among the most exciting 
signals scientists expect to observe, as our present knowledge and modeling ability of 
GWB-cmitting systems is rather limited. These signals often depend on complicated 
(and interesting) physics, such as dynamical gravity and the equation of state of matter 
at nuclear densities. While this makes GWBs an especially attractive target to study, 
our lack of knowledge also limits the sensitivity of searches for GWBs. Potential 
sources of GWBs include merging compact objects O [7l [H [9l [TOl [TTl [121 [13l [14] , core- 
collapse supernovae [HI [HI [171 [HI [H] j and gamma-ray burst engines [20] ; see [21] for 
an overview. 

Although a gravitational wave signal is characterized by two independent 
polarizations, an interferometric gravitational wave detector is sensitive to a single 
linear combination of them. Simultaneous observations of a gravitational wave 
burst by three or more observatories over-determines the waveform, permitting the 
source position on the sky to be determined, and a least-squares estimate of the 
two polarizations. This solution of the inverse problem for gravitational waves was 
first derived by Giirsel and Tinto [22] for three interferometers. Subsequent work 
generalized it [53j and formalized it as an example of a coherent maximum likelihood 
statistic by Flanagan and Hughes [7] and later Anderson, Brady, Creighton, and 
Flanagan [24]. Various modifications have been proposed, such as Rakhmanov's 
Tikhonov [25] and Summerscale's maximum-entropy [26] regularization techniques, 
Kilmenko et aVs constraint likelihood method [271 ISH], and the SNR variability 
approach of Mohanty et al. [2H] . Potential as a consistency test was noted by Wen 
and Schutz in [29] and demonstrated in Chattcrji et al [30]. Other coherent detection 
algorithms have been proposed by Sylvestre [31j and by Arnaud et al. |32j . 

These approaches to signal detection have generally been derived by following 
either ad hoc reasoning or a maximum-likelihood criterion (notable exceptions, 
foreshadowing our Bayesian approach, are Finn |33| . Anderson et al [241 and Allen 
et al [34]). In this paper we present a systematic and comprehensive Bayesian 
formulation [35] [36] of the problem of coherent detection of gravitational wave bursts. 
We demonstrate how to incorporate partial or incomplete knowledge of the signal in 
the analysis, thereby improve the probability of detection of these weak signals. This 
information may include time-frequency properties of the signal, polarization content, 
model waveform families or templates, as well as information on the distribution of 
the source through spacetime. We also explicitly identify the prior assumptions that 
must be made about the signal to cause a Bayesian analysis to behave like several of 
the previously proposed detection statistics. 

Real interferometers also experience instrumental artifacts that can masquerade 
as signals. These are typically dealt with in post-processing rather than by the 
detection statistic itself, but some proposals have been made to include a model for 
these "glitches" in the detection statistic [37l [38]. An advantage of the Bayesian 
framework is that the standard choice between signal and stationary noise could 
be extended to include a third option: randomly occurring noise "glitches" . This 
formulation is a promising direction for future progress, but we do not further discuss 
it in this paper. 
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The paper is organized as follows. In SjH wc derive the Bayesian posterior 
detection probability of an idealized delta-like burst signal by a toy-model network 
of observatories. Greatly expanding on |38| . we then generalize the derivation of the 
Bayesian odds ratio into a usable statistic for an arbitrary number of interferometers 
with differently colored and (potentially) correlated noises. We also consider a wide 
range of signal models corresponding to different states of knowledge about the burst, 
from total ignorance to complete a priori knowledge of the waveforms. In ^ we 
characterize the relative performance of previously proposed statistics (the Giirsel- 
Tinto/standard and constraint likelihoods) and the Bayesian statistic by performing a 
Monte-Carlo simulation in which we add a simple binary black-hole merger waveform 
to simulated detector data and construct Receiver-Operating Characteristic (ROC) 
curves. We find that the Bayesian method increases the probability of detection for 
a given false alarm rate by approximately 50%, over those associated with previously 
proposed statistics. 

2. Analysis 

2.1. Single-sample observation 

We begin by investigating perhaps the simplest Bayesian coherent data analysis: 
detecting a signal from a known sky position in a single strain sample from each 
of N gravitational wave observatories. This example will show many of the basic 
features of the Bayesian analysis, and highlight some of the differences between the 
Bayesian approach and previous statistics. In the following section we will generalize 
to a multi-sample search for a signal arriving at an unknown time from an unknown 
sky position. 

Consider a single strain sample from each of N detectors, each measurement taken 
at the moment corresponding to the passage of a postulated plane gravitational wave 
from some known location on the sky, {6, (f>). The measurements are then equal to |22| 



where x is the vector of measurements [xi, . . . jXn]"^ , the matrix F — 
[[Fi',F^], . .. , [F^,F^]] contains the antenna responses of the observatories to the 
postulated gravitational wave strain vector h = [h^, /ix]"^, and e is the noise in each 
sample. F is a known function of the source sky direction (9, (f>), and the decomposition 
into -|- and x polarizations requires us to choose an arbitrary polarization basis angle 
tjj for each source sky direction. 

We wish to distinguish between two hypotheses: Hq, that the data contains only 
noise, and Hi , that the data contains a gravitational wave signal. The Bayesian odds 
ratio [35l [36] allows us to compare the plausibility of the hypotheses: 



where / is a set of unstated but shared assumptions (such as the detector locations, 
orientations and noise power spectra). If the posterior plausibility ratio is greater than 
one. Hi is more plausible than Hq and we classify the observation as a detection. If 
the posterior plausibility ratio is less than one. Hi is less plausible than Hq and we 
classify the observation as a non-detection. 

The p{H\I) terms ("plausibility of H assuming /")are the prior plausibilities we 
assign to each hypothesis H on the basis of our knowledge / prior to considering the 



X = Fh + e. 



(1) 



p(iJi|x,/) _ piHi\I)p{^\Hi,I) 
p{Ho\^,I) p{Ho\I)p{^\Ho,I)' 



(2) 
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measurement; for example, our expectation that detectable gravitational waves are 
rare requires that p{Hi\I) <^ p{Hq\I). 

The p{x.\H,I) terms ("plausibility of x assuming H and /") are the probabilities 
assigned by a hypothesis to the occurrence of a particular observation x. These 
are sometimes called likelihood functions; they represent the likelihood of a certain 
measurement being made. 

The p{H\x,I) terms arc the posterior plausibilities we assign to the hypotheses 
in light of the observation. The hypothesis that assigned more probability to the 
observation becomes more plausible. 

For notational simplicity we will drop the / in our formulae; the unstated 
assumptions are implicit. 

If we make the idealized assumption that the noise in each detector is independent 
and normally distributed |351 136j with zero mean and unit standard deviation, we can 
then write the following expression for the likelihood p{x.\Hq) 

N 

^ 1 1 

i—l 

= (2^)-f exp(-Vx), (3) 

where ^ denotes matrix transposition. For real detectors, the measurements can be 
whitened, which modifies the effective beam pattern functions F. 

If we assume that there is a gravitational wave h present, then after subtracting 
away the response F h the data will be distributed as noise and the likelihood 
p{x\h.,Hi) becomes 

p(x|h, i/i) (2^)-f cxp(-i(x - F h)^(x - F h)) . (4) 

Unfortunately, we do not know the signal strain vector h a priori. To compute 
the plausibility of the more general hypothesis p{'x.\Hsignai) we need to marginalize 
away these nuisance parameters 

/ + OC i< + oc 
/ p(h|i/i)p(x|h,i7i)d/i+d/ix . (5) 
-oo J —oo 

The hypothesis resulting from the marginalization integral is an average of the 
hypotheses for particular signals h, weighted by the prior probability p(h|iJsig„ai) we 
assign to those signals occurring. A convenient choice of prior is to use a normal 
distribution for each polarization, with a standard deviation a indicative of the 
amplitude scale of gravitational waves we hope to detect. Under these assumptions 
the prior is 

^(^'^^^ = 2;^^"P(-2^^''^)- 
This allows us to perform the marginalization integral analytically 

/+00 P+OC 1 
/ cxp(-i((x-Fhf (x-Fh) 

= (2^)-^|I - K..|^ exp(-i x^(I - K..)x) , (7) 
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where 

Kss = F(F^F + (T"2j)-ipT^ (^g^ 

The result is a multivariate normal distribution with covariancc matrix (I — Kgs)^^, 
which quantifies the correlations among the detectors due to the presence of a 
gravitational wave signal. 

With both hypotheses defined, we can form the likelihood ratio 

p(x|i/o) 

= |I - K,,|5 cxp(i x^F(F^F + a-2l)-iF^x) . (9) 

Multiplying the likelihood ratio by the prior plausibility ratio p{Hi)/p{Ho) completes 
the calculation of the Bayesian odds ratio 

In the limit ct — > oo we find that the odds ratio contains the least-squares estimate 
of the strain 

h ^ (F^F)-iF^x. (10) 

The odds ratio may then be rewritten in terms of a matched filter for the response 
to the estimated strain, x-^Fh. For finite values of a, the odds ratio contains the 
Tikhonov regularized estimate of the strain [25] 

h= (F^F + tT-2i)-iF7'x, (11) 

and can still be rewritten as a matched filter for this estimate. 

It is also worth noting the presence in of the determinant |I — K.ss\ factor. It is 
independent of the data and depends only on the antenna pattern and the signal model. 
In particular, it tells us how strongly to weight likelihoods computed for different 
possible sky positions of the signal. This Occam factor penalizes sky positions of high 
sensitivity relative to sky positions of lower sensitivity which give similar exponential 
part of the likelihood. The effect is typically small compared to the exponential in 
most cases if the data has good evidence for a signal, but can be important for weak 
signals and for parameter estimation. 



2.2. General Bayesian model 

We now generalize the analysis of the previous section to the case of burst signals of 
extended duration and unknown source sky direction (0, 0) and arrival time r with 
respect to the centre of the Earth. 

A global network of N gravitational wave detectors each produce a time-series of 
M observations with sampling frequency /s, which we pack into a single vector 

X = [a;i,i,a;i,2, . . . ,xi^m,X2a,X2,2, ■ ■ ■ , X2,m, ■ ■ ■ ,xn,i,xn,27 ■ ■ ■,xn^m]'^ ■ (12) 
Our signal model is a generalization of ([1]), 

X = F(t, 6*, (/)) • h e , (13) 

where 

h = [h+^i, h+,2, . . . , /i+.L, /ix4' • ■ • ' ^x,l]^ (14) 

is a time-series of 2L samples describing the band-limited strain waveform (with the 
two polarizations packed into a single vector), e is a random variable representing 
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the instrumental noise, and F(r, 6*, 0) is a NM x 2L response matrix describing the 
response of each observatory to an incoming gravitational wave, 

F+{0, <P)T{t + Ari(0, 0)) F,^ {9, 0)T(t + An (0, cf,)) 
F+ie, 0)T(t + Ar2(0, 0)) F,^ [9, 0)T(t + At2((?, ^)) 



F(t,( 



(15) 



^^+(^^, 0)T(t + At^(0, 0)) ^^^(0, </))T(r + Atjv(0, 0)) 

Each M X L block of the response matrix is responsible for scaling and time shifting 
one of the waveform polarizations for one detector, so each block is the product of 
the directional sensitivity of each detector to each polarization, F^{9, </>) or F^^ (6*, (f>), 
and a time delay matrix rj.fc(t)[||, for the source sky direction dependent arrival times 
T + ATi(9, 6) at each detector. 



2.3. Noise model 

The noise that affects gravitational wave detectors is typically modeled as stationary, 
colored gaussian noise that is independent of the signal parameters. This can be 
represented with a multivariate normal distribution, which can be compactly written 
as 

AA(m, S,x) = ^^^^^/2^ exp(-i(x - m)^S-1(x - f,)) . (16) 

The vector fi is the mean of the distribution, and the positive-definite covariance 
matrix S describes the ellipsoidal shape of the constant-density contours of the 
distribution in terms of the pairwise covariances of the samples, 

^ {{e,- fi,),{ej - fij)) . (17) 

Using this notation, the noise likelihood is 

p(x|iJo)-AA(0,S,e) (18) 

for some MN x AIN positive definite matrix S. Under the additional assumption 
of stationarity over some timescale, these covariances can be estimated from previous 
observations. 

In the case of Gaussian stationary colored noise, each detector is individually 
represented by a Toeplitz covariance matrix S*-*^. For uncorrelated noise, the 
covariance matrix for the whole network is S = diag(S(i), S^^), . . . , S^^)). In the 
simple case in which all the noises are white, have equal standard deviation and are 
uncorrelated, we have S = diag(I, !,...,!) =1. 

The generalization of ^ and ^ for the signal likelihood is 

Kxli/i)- [ AA(F(t,0,0) •h,S,x)p(h,r,0,</)|ifi)dh...d0 , (19) 

where Vh,r,e,0 is the space of all signal parameters and p(h, t, 9, is the prior for 

these parameters. Without loss of generality we may separate this signal prior into a 
prior on source sky direction and arrival time, and a prior on the waveform conditional 
on the source sky direction and the arrival time, i.e. 

p{h, T, 9, <p\Hi) = p{T, 9, ^\Hi)p{h\T, 9, 0, Hi) , (20) 

J From the assumption that the signal is band-limited, it follows that the time delay matrix may be 
written as Tj j^it) = sinc(7r(j — k — /st)); for L = M and zero time delays, it is equal to the identity 
matrix; for L = M and time delays corresponding to integer numbers of time samples, it is a shift 
matrix. 
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giving 

p(x|Fi)= f Af{F{T,e,cb)-h,-S,^)p{T,e,cj)\H,)p{h\T,0,cf,,Hi)dh...dcf, . (21) 
2.4- Wideband signal model 

In analogy with the single sample case, we can choose a multivariate normal 
distribution prior for the waveform amplitudes and render the integral soluble in closed 
form. The marginalization integral over h in (|2ip can then be analytically performed, 
giving 

p(x|r,0,0,gi) _ 4,,AA(F(T,g,0) ■h,S,x)p(h|T,^,(^,Hi)dh 

p(x|i?o) AA(0,S,x) ^ > 

(see pop below). Numerical integration over a more manageable three dimensions is 
then sufficient to compute the Bayes factor, 



p(x|i/o) J J J ' pi^Wo) 

This signal model is computationally tractable. It represents signals that can be 
described by an invertible 2L x 2L correlation matrix, including the important 'least 
informative' case of independent, normally distributed samples of h. 



2.5. Informative signal models 

The wideband signal model excludes some important cases, such as when we have 
a known waveform, almost known waveform (such as from a family of numerical 
simulations) or even just a signal restricted to some frequency-band. These signals 
are superpositions of a (relatively) small number G < 2L of basis waveforms, that 
may themselves be characterized by a finite number of parameters, which we denote 
p. These parameters must be numerically integrated, like r, 0, and 0, which may be 
time-consuming. Their prior distribution will be denoted by p(p|t, 0, 0, Hi). 

To describe the signal as a superposition of basis waveforms |39j , define a set of 
amplitude parameters a mapped into strain h via a 2L x G matrix W(p, r, 9, 0) whose 
columns Wi(p, r, 0, 0) arc the basis waveforms, so that 

h = W(p,r,0,0).a. (24) 

We assume that the amplitude parameters a are multivariate normal distributed with 
a covariance matrix A(p, r, 6, 0), so that 

p(a|p, T, 0, 0, i/i) = ^(0, A(p, r, 9, 0),a) . (25) 

The resulting distribution for the waveform strain is 

p(h|r,6l,0,i?i) = / / p(h|a, p, r, 61, 0,i?i)p(a,p|r, 61, 0,i7i) da dp 
JVp JR'^ 

= [ [ <5(h- W-a)AA(O,A,a)p(p|T,6',0,i/i)dadp, (26) 

where for clarity we have begun to omit the dependence of matrices on their 
parameters. As G < 2L {i.e., we have fewer basis waveforms than samples in the 
signal time-series) the integral over a cannot be directly represented as a multivariate 
normal distribution. 
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This signal model proposes that gravitational wave signals have waveforms that 
are the sum of G basis waveforms with amplitudes that are normally distributed (and 
potentially correlated). The basis waveforms and their amplitude distributions may 
vary with source sky direction, arrival time, and any other parameters we care to 
include in p. The model is capable of representing a variety of sources including the 
important special cases of known 'template' waveforms, and band-limited bursts. We 
will consider some concrete examples in §2.6) perhaps the most important is a scale 
parameter cr, that permits us to look for signals of different total energies. 

We can substitute the expression back into part of to form a multivariate 
normal distribution partial integral whose solution is given in [35| : 

p{x\T,9,(j),Hi) =11 AA(F-h,S,x)(5(h- W-a)AA(0,A,a) 

JVp Jrg + 2l 

X p{p\t, 9, (j), i7i) dad/9 dh 
= / AA(O,(S-i-K)-i,x)p(p|T,0,<^,Hi)dp, (27) 

where the matrix 

K(p,T,6',^) = (S-iFW)((FW)^S-iFW + A-1)-\S-1fW)'^ (28) 

will be the kernel of our numerical implementation. Note that this is a generalization 
of equation ([5]) obtained in the single-sample case. Since 

p(x|p,T,g,0,gi) AA(0,(S-^-K)-i,x) . 1 J. 

Kxlffo) = m:^) = V|I-SK|exp(-x Kx) , (29) 

we have 
p(x|r,g,0,gi) 

and the Bayes factor becomes 

4414= / P(P, T, 9, mi)V\i - SKI exp(ix^Kx) dpdr dO dcf> . (31) 

In other words we have reduced the task of computing the Bayes factor to an integral 
over arrival time, source sky direction, and any additional signal model parameters p. 

2.6. Example signal models 

A simple signal model is the wideband signal model discussed briefly in Section [^Hl 
This is a burst whose spectrum is white, has characteristic strain amplitude a (at the 
Earth) and duration fg^L 

G =2L (32) 

A =cr^I (33) 

W = I . (34) 

If we assert that such bursts are equally likely to come from any source sky 
direction and arrive at any time in the observation window of f^^M seconds, then 
the priors are 

p{9\H^) =\sm{9) (35) 

p(0|iJi) = (27r)-i (36) 
p{t\H^) = /sAf-i . (37) 



/ p{p\T, 9, Hi) v/|I - SK| exp(ix^Kx) dp . (30) 
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If we assert that the sourec population is distributed uniformly in flat spaee up to some 
horizon Tmax we have a prior on the distance r to the source p(r\Hi) oc r^. We want to 
turn this into a prior on the characteristic amplitude cr, an example of a signal model 
parameter we must numerically marginalize over (p = [a]). Since the gravitational 
wave energy decays with the square of the distance to the source, cr^ cx r"^, we then 
deduce that: 



where oc r^l^y. is a lower bound on the amplitude of (or upper bound on the 

distance of) the gravitational wave. This bound is obviously somewhat arbitrary, 
but is a consequence of the way we distinguish between detection and non-detection. 
For a uniformly spatially distributed population of bursts there are of course many 
weak signals within the data, and the noise hypothesis is "never" true. In reality we 
are interested only in gravitational waves of at least a certain size. If fTmin is much 
smaller than the noise floor in all detectors, the expression for the noise hypothesis 
is an excellent approximation to the expressions of the likelihood we adopted. The 
classification of observations is insensitive to different choices of cTmin below the noise 
floor. 

This distribution of a is preserved if we consider a source population with a 
distribution of different intrinsic luminosities, so long as they are uniformly distributed 
in space out to their respective r^ax determined by the choice of CTmin- 

This is an example of a relatively uninformative signal model. It is capable of 
detecting signals of any waveform (of appropriate duration). However, it incurs a large 
Occam penalty for its generality, and cannot be as sensitive as a more informed search. 

The other extreme situation is where a source's waveform is completely known, 
but its other parameters (amplitude, source sky position, polarization angle) are 
not. Consider a source that produces a linearly polarized strain w. If the source's 
orientation, inclination and amplitude are unknown, we can parameterize the system 
with two amplitudes a mapping the strain into the observatory network's polarization 
basis 



This is the Bayesian equivalent of the matched filter. The template w appears twice 
because any specific signal typically will not be aligned with the polarization basis 
used to describe /i+ and in the detectors, but rather will be rotated by some 
polarization angle with respect to that basis. More generally, any signal model 
that is independent of the observatory network's polarization basis must have A and 
W composed of two identical sub-matrices on the diagonal like this, so that h+ and 
hx have the same statistical distribution. For example, if the source is not linearly 
polarized, but has strain described by w+ and Wx, then 



p[a\Hi) = p{r\Hr) — 



(38) 




(39) 



W = 



W_(- Wx 













(41) 



W+ Wx 



A more general case might be where we have a number of different predictions for a 
waveform, Wj, numerically derived. The resulting search looks for a linear combination 
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of these different waveforms, 

Wi W2 • • • 



W 



• • • Wi W2 



(42) 



2. 7. Comparison with previously proposed methods 



In this section we will expand on the arguments sketched in a previous paper |38| . 

Several previously proposed hypothesis tests, such as the Giirsel-Tinto (i.e. 
standard likelihood), the constraint likelihoods, and the Tikhonov-regularized 
likelihood, can be written in the form 

max x"^J(p, r, 9, (f>)x > A , (43) 

p,r,6,<p 

where J is an MNxMN matrix and A is a threshold. These tests proceed in two steps. 
First, parameters are estimated by maximizing the likelihood function with respect 
to the parameters. Second, the value of the likelihood function at its maximum is 
compared to a threshold A, which is chosen to ensure that it is only exceeded for the 
noise hypothesis at some acceptable false alarm rate. 

The corresponding Bayesian expression, from pip , integrates over source sky 
direction, arrival time and any other parameters and determines if the Bayes factor is 
large enough to overcome the prior plausibility ratio 

/ p{p, T, e, mi)V\i - SKI exp(ix^Kx) dpdr d0 d0 > • (44) 

Jv.^^.s., 2 p{Hi) 

There are some obvious similarities between P5)) and pi)) . in particular the 
quadratic forms central to each. However, direct mathematical equivalence cannot 
be established in general because of the difference between maximization and 
marginalization. 

We can establish equivalence for the related problem of parameter estimation, 
where we have maximum likelihood parameter estimate 

{p,T,6,(p} ^ argmax(x"^Jx) (45) 

and the Bayesian most plausible parameters, one of several ways the posterior 
plausibility distribution for the parameters can be turned into a point estimate 

{p,T,9,^} = argmax(p(p,r,0,0|i/i)v/|l- 5]K|exp(ix^Kx)) (46) 

argmax(x^Kx + 21np(p,r,6',(/)|iJi)+ln|I-SK|). (47) 

In the cases where we can find a Bayesian signal model that produces K = J, we must 
also use a prior 

p(p,r,0,0|iJi)cx |I-SK|-i (48) 

This prior states that gravitational wave bursts are intrinsically more likely to occur 
at the sky positions that the network is more sensitive to. We interpret this as an 
implicit bias present in any statistic of the form of (|43|l ?l. 

In order to compare previously proposed statistics to the Bayesian method, we 
place some restrictions on the configurations considered. We will assume co-located 

§ It is important to note that this particular objection applies only to all-sky searches; it is a 
consequence of the maximization over {9,<j>). These statistics are also used in directed searches (for 
example, in the direction of a gamma-ray burst) where (6, (p) is known and fixed, and the problem 
does not arise (the missing normalization term is one of several absorbed by tuning the threshold) . 
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(but differently oriented) detectors to eliminate the need to time-shift data, and we 
will use stationary signals and observation times that coincide with the time the 
signal is present. These restrictions eliminate the differences in the way previously 
proposed statistics and the Bayesian method handle arrival time and signal duration. 
For simplicity, we will further assume that the detectors are affected by white Gaussian 
noise. The conclusions drawn will apply equally to different versions of these statistics 
for colored noise or different bases other than the time-domain (such as the frequency 
or wavelet domains). 

2.7.1. Tikhonov regularized statistic The Tikhonov regularized statistic proposed in 
|25| for white noise interferometers is 

x^F{F'^F + a'^iy^F'^x. (49) 

The Bayesian kernel K reduces to this for 

S =1 (50) 

W = I (51) 

A = Q'^^I. (52) 

This is a signal of characteristic amplitude a = a^^. The Tikhonov regularizer a 
therefore places a delta function prior on the characteristic amplitude of the signal 
p{a\Hi) = S{a - a-^). 

The Tikhonov statistic behaves like a Bayesian statistic that postulates all bursts 
have energies in a narrow range. 

2.7.2. Giirsel-Tinto statistic The Giirsel-Tinto or standard likelihood statistic 

mEnn] is 

x^F(F^F)~^F^x. (53) 

For large cr, the Tikhonov statistic goes to 

K«F(F^F)-iF^. (54) 

This implies that the Giirsel-Tinto statistic is the limit of a series of Bayesian statistics 
for increasing signal amplitudes. 

2.7.3. Soft constraint likelihood The soft constraint statistic [571 HO] for white noise 
interferometers is 

fc2(6',</))x'^FF^x, (55) 

for some function k{d,(j)). Specifically, ((55|) gives the soft constraint likelihood for the 
choice k^ = (F+"^F+)~^, where the antenna response is computed in the dominant 
polarization frame [57]. 

Consider the signal model defined by 

S =1 (56) 
W = I (57) 
A = cr2fc2(6»,0)l. (58) 

This is a population of signals whose characteristic amplitude ak{9, 0) varies as some 
known function of source sky direction, slightly generalizing the situation of the 
Tikhonov statistic. For small cr, 

K K.a^k^ {9, (fyFF'^ , (59) 
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so we can see that the soft constraint is the limit of a series of Bayesian statistics for 
decreasing signal amplitudes. 



2.7.4- Hard constraint likelihood Let us restrict the soft-constraint signal model to 
a population of linearly polarized signals with a known polarization angle "0(0, 0) for 
each source sky direction 



W = 



(60) 
(61) 
(62) 



S =1 

' cos2V'(6',0)I 
sin 27/^(6*, (/))I 

Then for cr ^ the Bayesian statistic limits to 

fc2(e»,^)x^FW(FW)^x. (63) 

For the particular choice of ip{9, 0) being the rotation angle between the detector 
polarization basis and the dominant polarization frame, and = (FW)^FW (which 
is equal to (F+-^F+)~^ in the dominant polarization frame [27]), this yields the hard 
constraint statistic of [27] , 

In addition to the explicit assumptions that all signals are linearly polarized with 
known polarization angle, the hard constraint has the same properties as the soft 
constraint. 



2.8. Interpretation 

We have shown that several previously proposed statistics are special cases or limiting 
cases of Bayesian statistics for particular choices of prior. The 'priors' implicit in 
these non-Bayesian methods are not representative of our expectations about the 
source population, so we can reasonably expect improved performance from a detection 
statistic with priors better reflecting our state of knowledge. The Bayesian analysis 
allows us to begin with our physical understanding of the problem, described in terms 
of prior expectations about the gravitational wave signal population, and derive the 
detection statistic for these conditions. The effects of priors are lessened when there 
is a strong gravitational wave signal present; all these statistics, Bayesian and non- 
Bayesian, are effective at detecting stronger gravitational waves; significant differences 
occur only for marginal signals. In the next section, we will quantitatively compare 
the relative performance of the methods mentioned above and the Bayesian statistic 
we propose. 



3. Simulations 



To characterize the relative performance of the Giirsel-Tinto (i.e. standard likelihood), 
soft constraint, hard constraint, and Bayesian methods we used the X-Pipeline 
software package [H]. This package reads in gravitational wave data, estimates the 
power spectrum and whitens the data, and transforms it into a time-frequency basis of 
successive short Fourier transforms. Each statistic is then applied to the transformed 
data, and the results saved to file. This ensures that the observed differences are due 
to the statistics themselves, and not to different whitening or other conventions. 

Our tests used a set of 4 identical detectors at the positions and orientations of 
the LIGO-Hanford, LIGO-Livingston, GEO 600, and Virgo detectors. The data was 
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Figure 1. Timc-serics Lazarus waveforms |43| used for our simulations, from a 
nominal distance of 240 Mpc. 



simulated as Gaussian noise with spectrum following the design sensitivity curve of 
the 4-km LIGO detectors; it was taken from a standard archive of simulated data [42] 
used for testing detection algorithms. Approximately 12 hours of data in total was 
analysed for these tests. 

For the population of gravitational-wave signals to be detected we chose, 
somewhat arbitrarily, the "Lazarus" waveforms of Baker et al. [43] . These are fairly 
simple waveforms generated from numerical simulations of the merger and ringdown 
of a binary black-hole system. We chose to simulate a pair of 20 solar-mass black 
holes, which puts the peak of signal power near the frequencies of best sensitivity for 
LIGO. The time-series waveforms arc shown in Fig. [1] while the spectra and detector 
noise curve are given in Fig. [2l The sources were placed at the discrete distances 
240/S' Mpc[j]], where S = 1,2,2.5,3,10, and with randomly chosen sky position and 
orientation. Approximately 5000 injections were performed for each distance. 

For the Bayesian statistic we adapt the broad-band signal prior ([S^ - fM]) for each 
polarization. Specifically, we assume the simple model of a burst whose spectrum is 
white, with characteristic strain amplitude a at the Earth and duration equal to our 

II The fiducial distance 240 Mpc is chosen for rmmcrical convenience; it is the distance at which the 
sum-squared matched-filter SNR for each polarization is 1/2, assuming optimal antenna response 
(F+,F>< = 1). 
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frequency (Hz) 



Figure 2. Strain-equivalent noise amplitude spectral density of the simulated 
data used in our tests (black) with spectra of the Lazarus waveforms I43| at 240 
Mpc. The Lazarus spectra have been rescaled by T^^/^ = (128/sec)^/^ to render 
them into the same units as the noise spectrum. 



chosen FFT length: 

G = 2M (64) 

A =(T^I (65) 

W = I. (66) 

Wc use a uniform prior on the signal arrival time r. and an isotropic prior on the sky 
position (0, 0). The Bayesian statistic was computed for approximately logarithmically 
spaced discrete values of characteristic strain a — 10~^^,3 x 10"^'^, 10"'^^, 3 x 
10~^^, 10~^^, and averaged together in post-processing. This averaging approximates 
a single Bayesian statistic with a Jeffreys (scale invariant) prior p(cr) oc 1/cr between 
10"^"^ and 10~^^. (Performing the combination in post-processing allowed us to 
maintain compatibility with the existing architecture of X-Pipeline; we do not use 
the prior from (|39p because we are injecting from a fixed distance, not a spatially 
uniform population.) 

Each likelihood statistic was computed over a fixed frequency band of [64.1088] 
Hz, with an FFT length of 1/128 sec. We analyzed blocks of data, overlapping by 75% 
of their duration. The detection probability as a function of false alarm probability 
is shown in Fig. [31 The distance used for the Lazarus simulations for this figure 
was 240/2.5 = 96 Mpc; injections at other distances yielded similar results. At this 
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Table 1. Distances for false-alarm probability 10 ^ 



Statistic 


Distance (Mpc) 


Distance (rel.) 


Volume (rel.) 


Standard 


72.1 


1.01 


1.02 


Soft 


71.6 


1.00 


1.00 


Hard 


72.5 


1.01 


1.04 


Bayesian 


82.0 


1.15 


1.50 



distance, the total SNR deposited in the network yJ2a SNR was in the range ^1 — 8 
with a mean value of 5, where 

?'™° = ?*/* (67) 

and S{f) is the one-sided noise power spectral density of each interferometer. 

Fig. [3] is the receiver-operating characteristic plot for each of the statistics 
considered. The vertical axis represents the fraction of a population of signals whose 
detection statistics exceed the threshold that would only be crossed by background 
noise at the rate given by the false alarm probability on the horizontal axis. For 
example, we can read off the figure that if we can afford a false alarm probability 
of 10~^, the various detection statistics are able to detect between 0.4 and 0.6 of the 
injected signals. We see that the best performance is achieved by the Bayesian method 
with the a value most closely matching the injected signals, with the marginalized 
curve performing almost identically. The detection probability of the marginalized 
Bayesian method is significantly better than that of any of the non-Bayesian methods 
(standard likelihood, soft constraint, and hard constraint likelihoods) over the full 
range of false-alarm probabilities tested. 

For a given false-alarm probability, we may compute the distance at which 
each likelihood achieves 50% efficiency by fitting a sigmoid curve to the simulations. 
The observed volume, and therefore the expected rate of detections for a uniformly 
distributed source population, scales as the cube of the distance. We computed the 
distance and volume for each statistic for two false alarm probabilities, 10~^ in Table[T] 
and 1/256 « 3.9 x 10"'^ in Tabled As we compute 512 statistics per second, these 
correspond to false alarm rates of 1/200 Hz and 2 Hz respectively (as the statistics 
are computed on 75% overlapped data, these estimates are conservative). These rates 
are practical for event generation at the first stage of an untriggered (all-sky, all-time) 
burst search [HI [45l O [46] . At both of these false alarm probabilities, the Bayesian 
method can detect sources approximately 15% more distant, and consequently has an 
observed volume and expected detection rate approximately 50% greater, than the 
non-Bayesian statistics. 

It is important to note that we do not use detailed knowledge of the signal 
waveform for the Bayesian analysis. Our prior is that the signal spectrum is flat 
over the analysis band ([64,1088] Hz), and by imposing no phase structure or sample- 
to-sample correlations we are assuming that over the integration time (1/128 sec) 
the time samples of strain are independently and identically distributed. Considering 
Figures [Hand [21 it is clear that these priors are not particularly accurate models for the 
actual gravitational-wave signal. Nevertheless, our Monte Carlo results demonstrate 
that even this incomplete prior information can improve the sensitivity of the search. 
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false alarm probability 

Figure 3. Receiver-operating cliaracteristic (ROC) curves for the Bayesian, 
standard (Giirsel-Tinto), soft constraint, and hard constraint likelihoods for 
sources at 96 Mpc. The curve for a prior is obtained by marginalizing over 
probabilities associated with the discrete a values tested. The best performance 
is achieved by the cr value most closely matching the amplitude of the injected 
signals, with the marginalized curve performing almost identically. The detection 
probability of the marginalized Bayesian method is significantly greater than that 
of any of the non-Bayesian methods (standard/Giirsel-Tinto, soft constraint, and 
hard constraint likelihoods) over the full range of false-alarm probabilities tested. 



Table 2. Distances for false-alarm probability 1/256 



Statistic 


Distance (Mpc) 


Distance (rel.) 


Volume (rel.) 


Standard 


87.1 


1.03 


1.08 


Soft 


84.9 


1.00 


1.00 


Hard 


86.9 


1.02 


1.07 


Bayesian 


97.9 


1.15 


1.53 
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4. Conclusions and Future directions 

We have presented a comprehensive Bayesian formulation of the problem of detecting 
gravitational wave bursts with a network of ground-based interferometers. We 
demonstrated how to systematically incorporate prior information into the analysis, 
such as time- frequency or polarization content, source distributions and signal 
strengths. We have also seen that this Bayesian formulation contains several previously 
proposed detection statistics as special cases. 

The Bayesian methodology we have derived to address the problem of detecting 
poorly-understood gravitational wave bursts yields a novel statistic. On theoretical 
grounds we expect this statistic to outperform previously proposed statistics. A 
Monte-Carlo analysis confirms this expectation: over a range of false alarms rates, the 
Bayesian statistic can detect sources at 15% greater distances and therefore observe 
50% more events. 

The Bayesian search requires explicitly adopting a model for the poorly 
understood signal. This is not a shortcoming. The model may be agnostic with respect 
to many features of the waveform. As we have demonstrated, existing methods are 
not free from their own signal models, but implicitly assume priors on the energies 
and spatial distribution of sources. 

Coherent analyses of any kind are relatively costly, and efficient implementations 
must be sought. By specifying the problem as an integral, the Bayesian approach 
lets us leverage the extensive literature on numerical integration for techniques to 
accelerate the computation; one promising contender is importance sampling. 

As second practical issue that must be dealt with is that the background noise of 
real gravitational wave detectors contains transient non-Gaussian features ( "glitches" ) . 
As in the case of detection statistics, several ad hoc non-Bayesian statistics have 
been proposed to distinguish glitches from gravitational wave signals (see for example 
[30l [29]). Again, Bayesian methodology provides us with a direction in which to 
proceed: augment the noise model to better reflect "glitchy" reality, and to the extent 
we are successful, robustness will automatically follow. 
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